attack are disclosed. The automated tool injects a tracer value into both
GET and POST form data, and monitors the resultant HTML to determine whether
the tracer value is returned to the local machine by the server to which it
was sent. If the tracer value is returned, the automated tool attempts to
exploit the web site by injecting a non-malicious script as part of an input
value for some form data, based on the location in the returned HTML in which
the returned tracer value was found. If the exploit is successful, as
indicated by the non-malicious script, the automated tool logs the exploit to
a log file that a user can review at a later time, e.g., to assist in
debugging the web site.
Description

FIELD OF THE INVENTION 

[0001] The invention relates generally to computers and computer networks. More specifically, the invention relates
to computer security and prevention of malicious attacks on a computer system by a hacker by providing for the
automatic detection of a web site's vulnerability to cross-site scripting type attacks.

BACKGROUND OF THE INVENTION 


[0002] The proliferation of the Internet has created a massive venue for computer hackers to attempt to disrupt web 
services, cripple company web sites, and exploit users' private personal information. Typical types of Internet-based 
attacks include buffer overflow attacks, denial-of-service attacks, and a newer class of attack termed a cross site 
scripting (XSS, previously known in the art as CSS) attack. 
[0003] Cross site scripting attacks exploit a server that echoes some user supplied data back to the user's client
computer over HTTP or HTTPS. For example, suppose a CGI script accepts as input a person's name, such as is
illustrated in Figure 1. The CGI script might return to the client computer an HTML document that displays a message,
directed to that person, such as is illustrated in Figure 2. The echoed data is boldfaced in Figure 2 for illustrative
purposes only.

[0004] A malicious user such as a hacker might be able to exploit this echoing feature to execute malicious code on
a client computer. For example, a malicious user might persuade an inattentive user to click on a hyperlink
corresponding to a URL such as is shown in Figure 3. The malicious user might send the inattentive user an
innocent-looking link in an unsolicited email, or might maintain a web page that many people want to visit, e.g.,
advertising information about a popular celebrity. In either scenario (email or web page), a hyperlink is supplied
that corresponds to a URL such as is shown in Figure 3 (GET request) or an HTML form is pre-populated with malicious
form data (POST request). The CGI script, upon execution at the server, returns to the client an HTML document such
as is illustrated in Figure 4. The echoed data is boldfaced in Figure 4 for illustrative purposes only. Because the
user's Web browser receives the evil JavaScript from the trusted Web page (goodguy.com), the Web browser will execute
the script and allow access to anything to which goodguy.com would otherwise have access, e.g., a cookie with the
user's personal login and password, account information, credit card information, etc.

[0005] The ability to execute, on a user's local computer, a script appearing to originate from a trusted web site, but
that in fact originates from a malicious user, is a serious security vulnerability. For example, the simple script
alert(document.cookie) will pop up an alert dialog box displaying the user's current set of cookies for goodguy.com.
One of skill in the art will appreciate that a malicious user can do much more serious damage, including stealing
passwords or other personal information stored in a cookie (e.g., credit card information), or redirecting the user
to another (malicious) Web site.

[0006] While solutions for preventing cross site scripting attacks have been proposed, e.g., by performing validation
on received input to ensure that the input does not contain any malicious code, or encoding characters with special
meaning in HTML, there is presently no way to automate testing of a Web site for susceptibility to cross site scripting
attacks. In order to test for cross site scripting vulnerabilities, a tester must manually submit test data to a Web
server in the form of URLs with various test data. This manual testing is tedious and consumes unnecessary resources
(i.e., man-hours).
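The encoding countermeasure mentioned in paragraph [0006] can be illustrated with a short sketch. It is an assumption-laden example, not part of the described tool: the function name render_greeting and the greeting markup are hypothetical, and Python's standard html.escape is used only as one possible encoder.

```python
import html


def render_greeting(name: str) -> str:
    """Echo user input into HTML after encoding special characters.

    render_greeting is a hypothetical stand-in for a CGI script that
    echoes a person's name back to the client.
    """
    # html.escape turns <, >, & and (by default) quotes into entities,
    # so injected markup is displayed as text instead of being parsed.
    return "<p>Hello, <b>" + html.escape(name) + "</b>!</p>"


unsafe_input = "<script>alert(document.cookie)</script>"
print(render_greeting(unsafe_input))
```

A site that encodes echoed data in this way would return the tracer value as inert text, which is the kind of behavior the automated tool is meant to distinguish from an exploitable echo.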

[0007] Thus, it would be an advancement in the art to provide an automated solution for testing a Web site for
susceptibility to cross site scripting type attacks. It would be a further advancement in the art to provide an
automated software testing tool that checks not only for simple cross site scripting vulnerabilities, but also tests
for susceptibility to advanced cross site scripting attacks. It would be a further advancement in the art if the
automated software tool were able to use the same engine used by a common web browser to ensure that the site being
tested will perform exactly as when a user visits the web site using the common web browser.

BRIEF SUMMARY OF THE INVENTION

[0008] To overcome limitations in the prior art described above, and to overcome other limitations that will be apparent
upon reading and understanding the present specification, the present invention is directed to an automated software
tool that detects a vulnerability of a web site to a cross site scripting attack. The automated software tool submits a
tracer value as input to a web site, and monitors the web page returned by the web site as a result of submitting the
tracer value. When the tracer value is present in the returned web page, the automated software tool knows that the
web site might be vulnerable to a cross site scripting (XSS) attack. To confirm whether the web site is indeed
vulnerable to an XSS attack, based on the location of the returned tracer value, the automated software tool submits a
signaling script as input to the web site, and monitors the subsequently returned web page to determine whether the
signaling script is executed by the local computer when the subsequently returned web page loads on the local computer.
If the script is executed, the automated software tool knows that the web site is vulnerable to an XSS attack
corresponding to the format of the script submitted based on the determined location of the tracer value.

BRIEF DESCRIPTION OF THE DRAWINGS 



[0009] A more complete understanding of the present invention and the advantages thereof may be acquired by
referring to the following description in consideration of the accompanying drawings, in which like reference numbers
indicate like features, and wherein:

[0010] Figure 1 illustrates a URL including expected user input.

[0011] Figure 2 illustrates HTML produced as a result of submitting the URL illustrated in Figure 1.

[0012] Figure 3 illustrates a URL in which malicious code has been injected.

[0013] Figure 4 illustrates HTML produced as a result of submitting the URL illustrated in Figure 3.

[0014] Figure 5 illustrates an execution environment according to an illustrative embodiment of the invention.

[0015] Figure 6 illustrates a screenshot of an automated software tool according to an illustrative embodiment of the
invention.

[0016] Figure 7 illustrates a screenshot of a Web site to be tested for vulnerability to a cross site scripting attack
according to an illustrative embodiment of the invention.

[0017] Figure 8 illustrates a method for testing a Web site for susceptibility to cross site scripting attacks according
to an illustrative embodiment of the invention.

[0018] Figure 9 illustrates a test URL used by the automated software tool according to an illustrative embodiment
of the invention.

[0019] Figure 10 illustrates HTML returned by a server after the automated software tool has attempted an exploit
according to an illustrative embodiment of the invention.

[0020] Figure 11 illustrates an alert window displayed by the automated software tool according to an illustrative
embodiment of the invention.

[0021] Figure 12 illustrates a portion of a log file generated by the automated software tool according to an illustrative
embodiment of the invention.


DETAILED DESCRIPTION OF THE INVENTION 



[0022] In the following description of the various embodiments, reference is made to the accompanying drawings,
which form a part hereof, and in which is shown by way of illustration various embodiments in which the invention may
be practiced. It is to be understood that other embodiments may be utilized and structural and functional modifications
may be made without departing from the scope of the present invention.



GENERAL OPERATING ENVIRONMENT 



[0023] With reference to Figure 5, an exemplary system for implementing the invention includes a computing device,
such as computing device 100. In its most basic configuration, computing device 100 typically includes at least one
processing unit 102 and memory 104. Depending on the exact configuration and type of computing device, memory
104 may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.) or some combination of the two.
This most basic configuration is illustrated in Figure 5 by dashed line 106. Additionally, device 100 may also have
additional features/functionality. For example, device 100 may also include additional storage (removable and/or
non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated
in Figure 5 by removable storage 108 and non-removable storage 110. Computer storage media includes volatile and
nonvolatile, removable and non-removable media implemented in any method or technology for storage of information
such as computer readable instructions, data structures, program modules or other data. Memory 104, removable storage
108 and non-removable storage 110 are all examples of computer storage media. Computer storage media includes, but
is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks
(DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage
devices, or any other medium which can be used to store the desired information and which can be accessed by device
100. Any such computer storage media may be part of device 100.

[0024] Device 100 may also contain communications connection(s) 112 that allow the device to communicate with
other devices. Communications connection(s) 112 is an example of communication media. Communication media
typically embodies computer readable instructions, data structures, program modules or other data in a modulated
data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The
term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner
as to encode information in the signal. By way of example, and not limitation, communication media includes wired
media such as a wired network or direct-wired connection, and wireless media such as acoustic, Bluetooth, RF, infrared
and other wireless media. The term computer readable media as used herein includes both storage media and
communication media.

[0025] Communication connection(s) 112 allow device 100 to communicate with remote devices, e.g., server 120,
via one or more networks 118. Server 120 may be an application service provider's server for providing a Web service,
a Web server for accessing one or more Web pages of a Web site, or any other server that device 100, acting as
a client machine, may access. Network(s) 118 may include any number and type of wired and/or wireless networks,
including by way of example, the Internet, corporate intranets, LANs, WANs, and the like.

[0026] Device 100 may also have input device(s) 114 such as keyboard, mouse, pen, voice input device, touch input
device, etc. Output device(s) 116 such as a display, speakers, printer, etc. may also be included. All these devices are
well known in the art and need not be discussed at length here.

[0027] Figure 5 illustrates an example of a suitable operating environment 100 in which the invention may be
implemented. The operating environment 100 is only one example of a suitable operating environment and is not intended
to suggest any limitation as to the scope of use or functionality of the invention. Other well known computing systems,
environments, and/or configurations that may be suitable for use with the invention include, but are not limited to,
personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based
systems, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed
computing environments that include any of the above systems or devices, and the like.

[0028] The invention may be described in the general context of computer-executable instructions, such as program
modules, executed by one or more computers or other devices. Generally, program modules include routines,
programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data
types when executed by a processor in a computer or other device. The computer executable instructions may be
stored on a computer readable medium such as a hard disk, optical disk, removable storage media, solid state memory,
RAM, etc. As will be appreciated by one of skill in the art, the functionality of the program modules may be combined
or distributed as desired in various embodiments. In addition, the functionality may be embodied in whole or in part
in firmware or hardware equivalents such as integrated circuits, field programmable gate arrays (FPGA), and the like.

ILLUSTRATIVE EMBODIMENTS

[0029] With reference to Figures 6-12, the invention provides an automated software testing tool that checks a Web
site's susceptibility to cross site scripting (XSS) attacks by hackers and other malicious users. Figure 6 illustrates a
main menu screenshot for the automated software tool, indicating various functionalities and options available to the
user of the automated software tool. Figure 7 illustrates a screenshot of a Web page for which a user desires to
determine the susceptibility to cross site scripting type attacks. Figure 8 illustrates a method for determining
susceptibility to cross site scripting attacks as performed by an illustrative embodiment of the automated software
tool. Each figure is explained in more detail below.

[0030] Figure 6 illustrates a main menu 601 for the automated software tool, according to an illustrative embodiment
of the invention. After launching the automated software tool, a user may enter in input box 603 the host name
corresponding to the web site for which the user desires to test XSS vulnerability. Similarly, the user enters the
specific URI data, port number, and form data in input boxes 605, 607, and 609, respectively. The URI data 605 refers
to the data after the domain name portion of the URL. The port number 607 may be any port accessible on a remote server.
However, by default, port 80 is used unless otherwise specified, as port 80 corresponds to Web browser traffic. Form
data 609 describes the form fields that host 603 expects to receive through URI 605 on port 607.

[0031] After entering the above Web site information, the user may specify further options, such as selecting via
input box 611 whether Secure Sockets Layer (SSL) should be used. For example, some web pages are only accessible
over HTTPS (a secure connection). In these cases, the user should specify SSL. For example, a login page to a web
site will likely only work over SSL for security reasons. The user may also select via input box 613 whether testing
should be performed using the GET or POST method. The GET method, generally, refers to including data for variables
in the URL sent to a server. The POST method, generally, refers to submitting form data in the body of the request,
typically through a form on a web page. Once the Web site information and any associated options have been entered,
the user launches the testing process by selecting the 'Run' button 615.
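The difference between the GET and POST options described above can be sketched as follows. This is a minimal illustration under stated assumptions: build_request and its parameters are hypothetical names mirroring input boxes 603-613, the host www.example.com is a placeholder, and the request objects are only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(host, uri, port, form, use_ssl=False, method="GET"):
    """Build (but do not send) a request from the main-menu style inputs.

    All parameter names are hypothetical stand-ins for the host, URI,
    port, form data, SSL, and GET/POST fields.
    """
    scheme = "https" if use_ssl else "http"
    base = f"{scheme}://{host}:{port}{uri}"
    encoded = urlencode(form)
    if method.upper() == "GET":
        # GET: the form data travels in the URL itself.
        return Request(f"{base}?{encoded}", method="GET")
    # POST: the form data travels in the request body.
    return Request(base, data=encoded.encode("ascii"), method="POST")


get_req = build_request("www.example.com", "/search.asp", 80, {"q": "hacking"})
post_req = build_request("www.example.com", "/search.asp", 443,
                         {"q": "hacking"}, use_ssl=True, method="POST")
print(get_req.full_url)
print(post_req.full_url, post_req.data)
```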

[0032] As an alternative to manually entering information for each Web site, the user may store in a data file
information corresponding to multiple Web sites. The user can automatically test each Web site, without being required
to manually enter the information for each Web site into menu 601, by selecting 'Run File' button 617. Upon selecting
button 617, the user may be prompted for a file name for the file containing the information corresponding to the
multiple Web sites. Alternatively, the user may be required to store the information in a file with a predetermined
name, e.g., 'webdata.csv', and the automated software tool retrieves the information from the predetermined file upon
selection of button 617.
[0033] According to another alternative, the user may simply copy a URL to the clipboard (not shown) of the operating
system, e.g., Microsoft WINDOWS® brand operating system, which the automated software tool then parses to
determine the proper Web site information for input fields 603, 605, 607, and 609. For example, suppose a user wants
to test the Web site 701 illustrated in Figure 7. The user can copy the URL 703 to the clipboard by highlighting URL
703 and either selecting Edit, Copy from menu bar 705, or typing the corresponding shortcut Ctrl-C. Referring back to
Figure 6, the user may then select the 'Parse Clipboard' button 619, which causes the automated software tool to parse
the current contents of the clipboard and automatically populate input fields 603, 605, 607, and 609 with the hostname,
URI, port, and form data, respectively, as is illustrated in Figure 6. If the current contents of the clipboard do not
correspond to a Web site, the automated software tool displays an error (not shown) to the user. After successfully
parsing the clipboard's contents, the user may select button 615 to test Web site 701. The testing process will be
explained in more detail with reference to Figure 8, below.
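The clipboard-parsing step can be approximated with the standard library. The following is a sketch only: parse_url is a hypothetical name, the returned dictionary keys merely mirror input fields 603, 605, 607, and 609, and reading the clipboard itself is left out because it is platform specific.

```python
from urllib.parse import urlsplit


def parse_url(url):
    """Split a URL into host, URI, port, and form data.

    Raises ValueError when the text does not look like a Web site URL,
    which corresponds to the error case described in the text.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a Web site URL: {url!r}")
    # Assumed defaults: 80 for HTTP (per the text), 443 for HTTPS.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return {
        "host": parts.hostname,
        "uri": parts.path or "/",
        "port": port,
        "form_data": parts.query,
    }


info = parse_url("http://www.example.com/search.asp?email45=&NGSearch=")
print(info)
```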

[0034] If the user desires to interrupt the automated software tool before it has finished the automated testing process,
the user may select 'Abort Run' button 625, which is optionally only available to the user for selection during the
testing process (as shown). After the automated software tool has completed testing the web site, the user may view the
resulting log file by selecting 'View Log' button 623. If the user does not need to maintain the log file or otherwise
desires to delete the log file, the user can clear the log file by selecting 'Clear Log' button 621.
[0035] Other optional information may also be presented on the main menu 601. For example, status indicator 627
may indicate whether the automated software tool is ready to test a Web site (as shown), waiting for input from the
user, parsing the clipboard, parsing a file containing information corresponding to multiple Web sites, testing, and the
like. During and/or subsequent to testing a Web site, indicator 629 may indicate how many tests the current Web site
has passed, and indicator 631 may indicate how many tests the current Web site has failed (i.e., the number of different
ways in which the automated software tool has exploited the current Web site, which is thus susceptible to XSS attacks).
[0036] Figure 8 illustrates a method for testing a Web site for susceptibility to XSS-type attacks, according to an
illustrative embodiment of the invention. Initially, in step 801, a user navigates to a Web site to be tested. In step 803,
the user copies the URL of the Web site to the clipboard, and in step 805 selects the 'Parse Clipboard' button on the
automated software tool, causing the automated software tool to parse the clipboard and populate the Web site
information fields as described above. The automated software tool starts testing when the user selects the 'Run' button
in step 807. One of skill in the art will appreciate that, alternatively to steps 801-807, a user may manually enter
information in the main menu of the automated software tool (Figure 6) or may specify in an input file multiple Web sites
to test, as described above. For illustrative purposes only, the testing of a single Web site is described herein.
[0037] In step 809, the automated software tool opens a new browser window in which to test the subject Web site.
In step 811, the automated software tool parses the form data 609 (Figure 6) into key-value pairs. Each key-value pair
is separated with an ampersand (&), while the name of the key and its corresponding value are separated with an
equals sign (=). In the example illustrated in Figures 6 and 7, each key has a null value. However, if the value "hacking"
were specified for the key NGSearch, the form data might look like
"email45=&emailaddr=&NGSearch=hacking&SearchType=&...."
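Step 811 can be sketched with the standard library's query-string parser. The form data string below is the visible portion of the example above, with the elided remainder omitted; keep_blank_values=True is needed because most keys carry a null value.

```python
from urllib.parse import parse_qsl

# Visible portion of the example form data; the elided tail is omitted.
form_data = "email45=&emailaddr=&NGSearch=hacking&SearchType="

# Pairs are separated by '&', and key and value by '='.
# keep_blank_values=True preserves keys whose value is empty.
pairs = parse_qsl(form_data, keep_blank_values=True)

for key, value in pairs:
    print(f"{key!r} -> {value!r}")
```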
[0038] The automated software tool proceeds to check the web site's vulnerability to XSS attacks based on each
key-value pair. That is, the automated software tool initially tests vulnerability based on the first key-value pair, then
proceeds to the second key-value pair, etc., until all key-value pairs have been tested. Thus, in step 813 the automated
software tool selects the first (or next) key-value pair, referred to as the current key-value pair.

[0039] For each current key-value pair, the automated software tool performs steps 815-835. In step 815, the
automated software tool injects a plain text tracer, e.g., CSSTESTTAG, as the value of the current key-value pair, and
submits the resulting URL via the new browser window opened in step 809. For example, assuming the current
key-value pair is the first key-value pair, the automated software tool initially submits in step 815 the URL illustrated
in Figure 9. In step 817, the automated software tool receives HTML data back from the Web server to which the URL
was submitted (e.g., server 120, Figure 5), and the automated software tool scans the HTML to check whether the tracer
value was returned by the server. In one illustrative embodiment of the invention, in order to ensure that the web site
being scanned will behave exactly as in a user's browser window, the automated software tool scans the document
object model (DOM) of the web page. The DOM of the returned HTML web page is exposed by Internet Explorer's
HTML Rendering Engine (e.g., using mshtml.dll, dispex.dll, iesetup.dll, mshtml.tlb, mshtmled.dll, and mshtmler.dll). If
the tracer value is not found in the DOM, then the automated software tool determines in step 819 whether any
key-value pairs are left and, if so, returns to step 813. If there are no remaining key-value pairs, the automated
software tool proceeds to step 837.
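The per-pair tracer injection of steps 813-819 can be sketched as below. This is an illustrative simplification, not the described implementation: tracer_queries and is_reflected are hypothetical names, and a plain substring check stands in for scanning the DOM exposed by the browser's rendering engine.

```python
from urllib.parse import urlencode

TRACER = "CSSTESTTAG"


def tracer_queries(pairs, tracer=TRACER):
    """Yield (key, query string) with the tracer replacing one value at a time."""
    for index, (key, _value) in enumerate(pairs):
        probe = list(pairs)          # copy so the other pairs keep their values
        probe[index] = (key, tracer)
        yield key, urlencode(probe)


def is_reflected(html_text, tracer=TRACER):
    """Substring check; an assumed stand-in for the DOM scan in step 817."""
    return tracer in html_text


pairs = [("email45", ""), ("NGSearch", "hacking")]
for key, query in tracer_queries(pairs):
    print(key, "->", query)
```

Each generated query string would be submitted in turn, and only keys whose response reflects the tracer would move on to the exploit attempts.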

[0040] If the tracer value is not found within the HTML DOM, then the tool proceeds to step 819. However, if the
tracer value is found within the HTML DOM, the automated software tool also analyzes in step 817 where the tracer
value was found. Next, in step 821, the automated software tool determines whether the location in which the tracer
value was found corresponds to a special case for which a known exploit exists. If so, the automated software tool
attempts in step 823 to exploit the Web site's vulnerability to a cross site scripting attack based on the location in which
the tracer value was found in step 817. If no special case exists, the automated software tool proceeds to step 829 to
attempt to exploit the web site using a different tracer value to inject a custom HTML tag into the DOM.
[0041] Examples of locations that are indicative of known exploits include, but are not limited to, the following:

[0042] 1) Tracer value returned as text displayed in body of web page.

[0043] When the tracer value is returned as displayed text in the body of the returned web page, the automated
software tool determines that a cross site scripting attack, if the web site is vulnerable, can be completed by injecting
a script as-is as the value of the current key-value pair, meaning that the automated software tool can inject a script
in the form "<script>exploit-script-goes-here;</script>" (without the enclosing quotation marks).
[0044] 2) Tracer value returned in an HTML tag.

[0045] When a tracer value is returned in an HTML tag, the resulting HTML may appear similar to the following
illustration where the tracer value was submitted as an input value:

<INPUT type="text" value="CSSTESTTAG">

[0046] However, one will note that the 'INPUT' tag is not closed prior to encountering the tracer value. Thus, the
automated software tool determines that a cross site scripting attack, if the web site is vulnerable, can be completed
by injecting a script preceded by tag closing indicia, e.g., in the form "><SCRIPT>exploit-script-goes-here;</SCRIPT>"
(without quotation marks).

[0047] On some web sites, it might not be possible to close the preceding tag, but rather attributes can be added.
Thus, the automated software tool may also attempt to add attributes to the value. For example, where the tracer is
returned in the HTML "<A HREF=www.test.com/default.asp?zip=>", the automated software tool may send the zip code
value "90210 onclick="exploit-script-goes-here;"" (without outside quotation marks). The resulting HTML might look
like that illustrated in Figure 10. Clicking on the resulting link would then launch the injected script.

[0048] 3) Tracer value returned as attribute of IMG or A HREF tag.

[0049] "IMG," "A," and several other tags allow a URL to be specified as an attribute. Thus, the automated software
tool determines that a cross site scripting attack, if the web site is vulnerable, can be completed by injecting a script
protocol as the URL, for example javascript:exploit-script-goes-here, vbscript:exploit-script-goes-here, etc. In some
instances, a web site may use user-supplied input as a portion of an IMG SRC attribute. That is, a web site may base
a resultant file name on user input, e.g., the input value 'CSSTESTTAG' may result in the tag
"<IMG SRC=fooCSSTESTTAG.jpg>" (without quotation marks). Thus, the automated software tool determines that a cross
site scripting attack, if the web site is vulnerable, can be completed by injecting a script preceding the "jpg>".
[0050] 4) Tracer value returned as part of a block of script.

[0051] Sometimes the tracer value may be inserted into a block of script. In this case, it is not necessary to include
the SCRIPT tag with the script input. This attack does require, however, that the returned data be a syntactically correct
script. For example, a web page to tell users they are about to enter a part of the site where viewer discretion is advised
may include a redirect value for the page to which the user should be redirected. The URL for the page might look like:

[0052] http://www.test.com/blabla/acssrv.dll?action=acwarning&redir_url=%2Fisapi%2Facssrv%2Edll%3Faction%3Derror%26commid%3D
[0053] One of the parameters passed in the URL is the redir_url. The user-supplied data may be returned inside of
a SCRIPT tag, such as "var Request_redir_url = '/isapi/acssrv.dll?action=error&commid=';". The value for the redir_url
from the URL becomes the value of the JavaScript variable named Request_redir_url. The automated software tool
determines that a cross site scripting attack, if the web site is vulnerable, can be completed by injecting a script at the
end of the URL without SCRIPT tags, e.g., with a URL such as:

[0054] http://www.test.com/isapi/acssrv.dll?action=acwarning&c=&redir_url=%2Fisapi%2Facssrv%2Edll%3Faction%3Derror%26commid%3D';some-evil-JavaScript-goes-here;var%20strBogus='gotcha
[0055] When the automated software tool sends the above URL, the following is returned to the browser inside of
the SCRIPT tag:

"var Request_redir_url = '/isapi/acssrv.dll?action=error&commid=';alert("Cross-siteScriptingVulnerabilityFoundByCSSProbe.");var strBogus='gotcha"

[0056] The automated software tool successfully injects the script (i.e.,
alert("Cross-siteScriptingVulnerabilityFoundByCSSProbe.")), by including "var strBogus='gotcha" after it because some
web servers (e.g., MSN) always append ";" after the user-supplied data. The automated software tool makes the script
syntactically correct by declaring a new variable named strBogus so as to avoid a script error.
[0057] 5) Tracer value is returned in HTML comments.

[0058] Sometimes the user-supplied data (i.e., the tracer value) is returned inside of an HTML comment, e.g., the
CGI script returns an error and uses the user input for debugging purposes. However, the comment field is not closed
prior to encountering the tracer value CSSTESTTAG. Thus, the automated software tool determines that a cross site
scripting attack, if the web site is vulnerable, can be completed by injecting a script preceded by comment closing
indicia, e.g., in the form "--> <SCRIPT>exploit-script-goes-here;</SCRIPT>" (without quotation marks).
[0059] 6) Tracer value not located in a location for which there is a known exploit.

[0060] In some instances the tracer value may be found in the DOM in a location for which there is not a known
exploit. In such situations, the automated software tool may attempt to exploit the web site by injecting a STYLE attribute.
While the automated software tool may inject other attributes based on the tracer value being found in other specific
locations, the STYLE attribute is used as a default, or fall-back, attempt because the STYLE attribute is common to
many HTML tags. Thus, the automated software tool may attempt various exploits by injecting the STYLE attribute in
various formats, such as ' STYLE=, " STYLE=, and the like.

[0061] The above scenarios are meant by way of illustration only, and should not be interpreted as limiting the
automated software tool to only those scenarios. Those of skill in the art will appreciate that other scenarios may be
known or later discovered, and the automated software tool may be adapted to account for such other scenarios.
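The location analysis behind scenarios 1) through 5) can be sketched with Python's standard html.parser module. This is an assumption-heavy illustration: the described tool inspects the DOM exposed by a browser rendering engine, whereas this sketch uses a generic parser, and the class name TracerLocator and the location labels are hypothetical.

```python
from html.parser import HTMLParser

TRACER = "CSSTESTTAG"


class TracerLocator(HTMLParser):
    """Record the kind of location in which the tracer value appears."""

    def __init__(self, tracer=TRACER):
        super().__init__(convert_charrefs=True)
        self.tracer = tracer
        self.locations = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names.
        if tag == "script":
            self._in_script = True
        for name, value in attrs:
            if value and self.tracer in value:
                kind = "url_attribute" if name in ("href", "src") else "attribute"
                self.locations.append(kind)

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self.tracer in data:
            self.locations.append("script" if self._in_script else "body_text")

    def handle_comment(self, data):
        if self.tracer in data:
            self.locations.append("comment")


def locate_tracer(html_text, tracer=TRACER):
    """Return the list of location kinds where the tracer was found."""
    parser = TracerLocator(tracer)
    parser.feed(html_text)
    parser.close()
    return parser.locations


print(locate_tracer('<INPUT type="text" value="CSSTESTTAG">'))
```

An empty result corresponds to the tracer not being reflected at all; a result outside these labels would correspond to scenario 6), where no known exploit applies.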
[0062] Referring back to Figure 8, the automated software tool attempts in step 823 to exploit the web site by injecting
a non-malicious script as described above. In an illustrative embodiment of the invention, a script such as
<SCRIPT>alert("Cross-siteScriptingVulnerabilityFoundByCSSProbe.")</SCRIPT> is used. Any other readily identifiable
text may alternatively be used. Thus, if the exploit is successful, the resulting HTML returned to the browser window
will cause a pop up window to display the text "Cross-siteScriptingVulnerabilityFoundByCSSProbe," such as is
illustrated in Figure 11. In step 825, the automated software tool determines whether the exploit succeeded by monitoring
the local system for the pop up window. If the pop up window is not displayed within a predetermined amount of time,
e.g., by the time the returned HTML has finished loading or shortly thereafter, the automated software tool determines
that the exploit was unsuccessful and proceeds to step 829. If, however, the pop up window is displayed within the
predetermined amount of time, the automated software tool determines that the exploit was successful, and proceeds
to step 827 where exploit data is written to the log file with information sufficient to identify the type of XSS attack to
which the web site is susceptible. The log file may be reviewed by a user at a later time, e.g., to assist in debugging
the subject web site.
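The timed check of step 825 can be sketched, under stated assumptions, as a simple polling loop. In the Python sketch below, popup_visible is a hypothetical callable supplied by the caller that reports whether the pop up window is currently displayed; how the window is actually detected on the local system is not specified here, and the timeout and polling interval are illustrative values only.

```python
import time


def exploit_succeeded(popup_visible, timeout_s=5.0, poll_s=0.1):
    """Poll popup_visible() until it returns True or the timeout expires.

    popup_visible is a hypothetical callable standing in for whatever
    mechanism monitors the local system for the pop up window.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if popup_visible():
            return True   # corresponds to proceeding to step 827 (log)
        time.sleep(poll_s)
    return popup_visible()  # one last check; False leads to step 829
```

A caller would log the exploit when the function returns True and otherwise fall through to the tag-based tracer attempt.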

[0063] For example, in one embodiment, with reference to Figure 12, the log file is written as an HTML document
which a user can review upon completion of the testing cycle. Each entry in the log file may appear as a row in a table,
and entries may be color coded to indicate whether each web page is vulnerable or safe. Those of skill in the art will
appreciate that the log file may be any type of file sufficient to indicate to a user the type of exploit to which each web
page is vulnerable. After logging the exploit in step 827, the automated software tool returns to step 819.
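As one possible illustration of such a log, the following Python sketch renders entries as color-coded table rows. The entry format (a URL paired with an exploit description, or None for a safe page), the colors, and the function name are assumptions of this example; the layout of Figure 12 is not reproduced. Values are HTML-escaped so that logged payloads are displayed rather than executed when the log is opened.

```python
import html


def render_log(entries):
    """Render (url, exploit_or_None) pairs as an HTML table.

    Vulnerable pages get a red-tinted row, safe pages a green-tinted row.
    """
    rows = []
    for url, exploit in entries:
        vulnerable = exploit is not None
        color = "#ffcccc" if vulnerable else "#ccffcc"
        status = exploit if vulnerable else "safe"
        rows.append(
            f'<tr style="background:{color}">'
            f"<td>{html.escape(url)}</td><td>{html.escape(status)}</td></tr>"
        )
    return (
        "<html><body><table>\n"
        "<tr><th>Page</th><th>Result</th></tr>\n"
        + "\n".join(rows)
        + "\n</table></body></html>"
    )
```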

[0064] In step 829, the automated software tool injects a tag-based tracer value, based on the results of the exploits
attempted in step 823. For example, if the automated software tool does not find the tracer value, in step 817, in a tag
attribute that it knows how to exploit, the automated software tool will attempt to add a tag to the DOM, e.g.,
<CSSTESTTAG>. The automated software tool sends <CSSTESTTAG> first because it is typically the most likely tag to
succeed at exploiting a web site. The automated software tool may also attempt to exploit the web site using tags such
as "><CSSTESTTAG>, '><CSSTESTTAG>, and other syntactic variations attempting to get the tag into the resulting
DOM. Once the tag is found within the DOM, the automated software tool replaces <CSSTESTTAG> with the script
tag and the non-malicious script exploit code. In step 831, the automated software tool checks the returned HTML and/or
DOM to determine whether the server returned the new tracer tag in the returned web page. If so, the automated
software tool proceeds to step 833. If not, the automated software tool proceeds to step 819.
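The check of step 831 might be approximated as in the following Python sketch, which uses the standard library's html.parser in place of a browser DOM. That substitution is an assumption of this example, and a real browser may parse malformed markup differently. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

TRACER_TAG = "csstesttag"  # HTMLParser reports tag names in lower case

# Syntactic variations named in the text, tried in this order.
VARIANTS = ["<CSSTESTTAG>", '"><CSSTESTTAG>', "'><CSSTESTTAG>"]


class TagFinder(HTMLParser):
    """Record whether the tracer tag was parsed as a real element."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == TRACER_TAG:
            self.found = True


def tracer_tag_in_page(page_html):
    """Return True if the tracer appears as a tag, not as escaped text."""
    finder = TagFinder()
    finder.feed(page_html)
    finder.close()
    return finder.found
```

A page that echoes the tracer as &lt;CSSTESTTAG&gt; would return False, since the escaped text never becomes an element.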

[0065] In step 833, because the new tracer value was returned by the server, the automated software tool attempts
to exploit the web site by sending the exploit script, again based on the location of the new tracer value in the returned
web page. In step 835, the automated software tool determines whether the exploit was a success based on whether
the pop up window appears within the predetermined amount of time. If the exploit in step 833 was a success, the
automated software tool proceeds to step 827 where the exploit is logged to the log file. If the exploit is not a success,
the automated software tool proceeds to step 819.

[0066] In step 837 the automated software tool attempts to exploit the web site by inserting an exploitable attribute
into the DOM. That is, some web sites, while not explicitly advertising that they accept key-value pairs, will accept a
key-value pair if one is submitted. For example, a web site with the URL 'http://www.test.com/main.asp' does not appear
to accept any key-value pairs. However, when the URL 'http://www.test.com/main.asp?name=CSSTESTTAG' is
submitted, the resulting HTML or DOM may contain the tracer value 'CSSTESTTAG.' Thus, the automated software tool
determines that a cross site scripting attack, if the web site is vulnerable, can be completed by injecting a script based
on the location of the attribute tracer value, as described above. The automated software tool, in step 839, determines
whether the attribute attack was a success based on whether the pop up window appears within the predetermined
amount of time. If the attack is a success, the automated software tool logs the exploit in step 841, similar to logging
step 827.
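Appending a probe key-value pair to a URL that advertises none can be sketched as follows in Python. The key "name" and the tracer "CSSTESTTAG" come from the example above; the function name is hypothetical. Note that urlencode percent-encodes special characters, which is harmless for the plain tracer but would alter a script payload, so a real tool would need to control encoding deliberately.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_probe_pair(url, key="name", value="CSSTESTTAG"):
    """Return url with an extra key=value pair added to its query string."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


print(with_probe_pair("http://www.test.com/main.asp"))
```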

[0067] Those of skill in the art will appreciate that the method illustrated in Figure 8 may be modified without departing
from the scope and spirit of the invention. That is, some steps in Figure 8 may be performed in other than the recited
order, one or more steps may be optional, additional steps may be inserted, and some steps may be combined, all
while performing substantially the same function as described above. For example, Figure 8 indicates that once an
exploit for a key-value pair is found, the automated software tool proceeds to the next key-value pair. This is because
the automated software tool makes an intelligent decision as to the type of exploit required based on the location of
the returned tracer value. That is, the automated software tool makes an intelligent decision whether the injected script
should be preceded by ">", "-->", ".jpg>", ".jpg STYLE=", null, or the like, based on the tracer value's returned location.
However, one of skill in the art will appreciate that, in an alternative illustrative embodiment, the automated software
tool may attempt to exploit each key-value pair by iteratively attempting each script format. That is, brute force may
alternatively be used. In yet another alternative illustrative embodiment, the automated software tool may halt testing
as soon as a first vulnerability is detected.
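The difference between the location-based decision and the brute-force alternative can be illustrated with the following Python sketch. The location labels and the mapping from label to prefix are hypothetical names chosen for this example; only the prefixes themselves and the alert script are drawn from the description.

```python
SCRIPT = (
    '<SCRIPT>alert("Cross-siteScriptingVulnerabilityFoundByCSSProbe.")'
    "</SCRIPT>"
)

# Hypothetical location labels mapped to the prefix that should precede
# the injected script when the tracer value is found at that location.
PREFIX_BY_LOCATION = {
    "body_text": "",        # null prefix: script tag comes first
    "html_tag": ">",        # close the enclosing tag
    "comment": "-->",       # close the enclosing comment
    "img_tag": ".jpg>",     # graphical file extension, then close the tag
}


def candidate_payloads(location=None):
    """One targeted payload for a known location, else every format."""
    if location in PREFIX_BY_LOCATION:
        return [PREFIX_BY_LOCATION[location] + SCRIPT]
    # Brute-force alternative: iterate over each script format.
    return [prefix + SCRIPT for prefix in PREFIX_BY_LOCATION.values()]
```

With a known location the tool submits a single payload and moves on to the next key-value pair; without one, each format is attempted in turn.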

[0068] As illustrated in Figure 6, the automated software tool can also test a web site's vulnerability to XSS attacks
using the known POST command. Often web pages containing forms through which a POST command can be sent
include error checking within the form in the web page itself. That is, often error checking is performed locally on the
client computer before the data is sent to the server. In such a scenario, included in the error checking may be validation
of input data to ensure that the input values do not contain SCRIPT or other illegal or potentially malicious tags. However,
in order to circumvent the locally-performed error checking, as would a malicious user, the automated software tool
parses the form to determine the values actually returned to the server, and creates a duplicate page that performs no
error checking. The automated software tool then performs the above analysis, such as is illustrated in Figure 8, on
the web site using the duplicated web page that performs no error checking.
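A simplified way to picture the form-parsing step is the Python sketch below. Rather than building a duplicate page, it extracts the form's action, method, and named input fields so that the values can be submitted directly, which bypasses client-side handlers in the same way; this substitution, the class and function names, and the assumption of a page containing a single form are all choices of this example.

```python
from html.parser import HTMLParser


class FormFields(HTMLParser):
    """Collect the action, method, and named inputs of a one-form page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attr.get("action") or ""
            self.method = (attr.get("method") or "get").upper()
        elif tag == "input" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_post_data(form_html, tracer="CSSTESTTAG"):
    """Return (action, method, data) with every named field set to tracer.

    Client-side checks such as an onsubmit handler are never executed,
    because the data is taken from the markup and sent directly.
    """
    parser = FormFields()
    parser.feed(form_html)
    parser.close()
    return parser.action, parser.method, {n: tracer for n in parser.fields}
```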

[0069] The above-described automated software tool is a powerful tool to identify vulnerabilities in web sites. If the
tool is obtained by a malicious user, the malicious user could exploit the automated software tool itself to identify
weaknesses in various web sites. Thus, to prevent unauthorized or malicious users from taking advantage of the
automated software tool's capabilities, in one illustrative embodiment of the invention the automated software tool,
during initialization, checks to determine whether it is being executed from a predetermined, or home, network. For
example, the automated software tool may check to determine that it is being executed from within a corporate intranet
by confirming that it can ping a predetermined corporate server that is only accessible from within the corporate intranet.
If the automated software tool cannot contact the predetermined corporate server, the automated software tool may
shut down or refuse to test any web sites.
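The home-network check might be sketched as follows. This Python example substitutes a TCP connection attempt for the ping described above, since the standard library offers no portable ping; the function name, default port, and timeout are assumptions of the sketch, and the host passed in would be the predetermined corporate server.

```python
import socket


def on_home_network(host, port=80, timeout_s=2.0):
    """Return True if a TCP connection to the intranet-only host succeeds.

    A TCP connect stands in for the ping described in the text.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

During initialization, a tool would call this once and shut down or refuse to test when it returns False.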

[0070] Alternatively, in order to ensure that the automated software tool is not used on arbitrary web sites, the
automated software tool may be hardcoded to only test web sites falling within a predetermined list of domain names, hosts,
and/or URIs. If the user attempts to test a web site not falling within the predetermined list of web sites, the automated
software tool may shut down or refuse to test the desired web site.
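A minimal sketch of such a predetermined list, in Python, is shown below. The allowed host is a placeholder taken from the earlier example URL, and matching on the host name and its subdomains is one assumed policy among several that would satisfy the description.

```python
from urllib.parse import urlsplit

# Hypothetical hardcoded list; a real deployment would list its own hosts.
ALLOWED_HOSTS = {"www.test.com"}


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True if the URL's host is an allowed host or a subdomain."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)
```

Comparing on the parsed host name, rather than on a substring of the URL, prevents a look-alike such as www.test.com.evil.example from passing the check.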

[0071] While the invention has been described with respect to specific examples including presently preferred modes
of carrying out the invention, those skilled in the art will appreciate that there are numerous variations and permutations
of the above described systems and techniques. Thus, the spirit and scope of the invention should be construed broadly
as set forth in the appended claims.



Claims 

1. A computer-performed method for automated detection of a cross site scripting vulnerability of a web site,
comprising:

determining key-value pairs corresponding to the web site;

for each determined key-value pair, at least until a first vulnerability is detected, performing a sub-method
comprising:

submitting the key-value pair to the web site, wherein the value of the key-value pair comprises a tracer
value;

receiving a web page responsive to the submitted key-value pair;

determining a location of the tracer value, when present, in the received web page; and

when the tracer value is present in the received web page, submitting a second key-value pair to the web
site, wherein the value of the second key-value pair comprises a script.

2. The computer-performed method of claim 1, wherein the script, if executed, signifies that the web site is vulnerable
to a cross site scripting attack, and wherein a format of the script-comprised value of the second key-value pair is
based on the determined location of the tracer value.

3. The computer-performed method of claim 1, further comprising writing vulnerability data corresponding to the web
page to a log file, based on whether the script is executed.

4. The computer-performed method of claim 2, wherein the location of the tracer value is determined based on a 
document object model of the web page. 

5. The computer-performed method of claim 4, wherein when the location of the tracer value is within displayed text 
of a body of the web page, the format of the script-comprised value begins with a script tag. 

6. The computer-performed method of claim 4, wherein when the location of the tracer value is within an HTML tag,
the format of the script-comprised value begins with indicia that closes the HTML tag.

7. The computer-performed method of claim 6, wherein when the location of the tracer value is within an IMG tag, 
the format of the script-comprised value begins with a graphical file extension. 

8. The computer-performed method of claim 4, wherein when the location of the tracer value is within a script block,
the format of the script-comprised value does not begin with a <SCRIPT> tag.

9. The computer-performed method of claim 6, wherein when the location of the tracer value is within a comment 
field, the format of the script-comprised value begins with "-->". 






10. The computer-performed method of claim 1, further comprising:

prior to performing the sub-method, determining whether the web site falls within a range of allowed web sites;
and

if the web site does not fall within the range of allowed web sites, halting execution of the computer-performed
method.

11. The computer-performed method of claim 1, further comprising:

prior to performing the sub-method, determining whether the computer performing the method is located on
a home network; and

if the computer performing the method is not located on the home network, halting execution of the
computer-performed method.

12. The computer-performed method of claim 1, wherein web site information from which the key-value pairs are
determined is received via a clipboard of the computer's operating system.

13. The computer-performed method of claim 1, wherein web site information from which the key-value pairs are 
determined is received via an input file comprising a listing of multiple web sites to be tested. 



14. The computer-performed method of claim 1, further comprising:

when the web site has no corresponding key-value pairs, submitting to the web site a third key-value pair,
wherein the value of the third key-value pair comprises a script that, if executed, signifies that the web site is
vulnerable to a cross site scripting attack.

15. The computer-performed method of claim 1, wherein the key-value pairs are submitted via a form.

16. The computer-performed method of claim 15, wherein the key-value pairs are submitted via a POST command.

17. The computer-performed method of claim 1, wherein the key-value pairs are submitted via a URL.

18. The computer-performed method of claim 17, wherein the key-value pairs are submitted via a GET method.

19. A computer-readable medium comprising computer readable instructions that, when executed, cause a computer
to perform a method for automated detection of a cross site scripting vulnerability of a web site, comprising:

determining key-value pairs corresponding to the web site;

for each determined key-value pair, at least until a first vulnerability is detected, performing a sub-method
comprising:

submitting the key-value pair to the web site, wherein the value of the key-value pair comprises a tracer
value;

receiving a web page responsive to the submitted key-value pair;

determining a location of the tracer value, when present, in the received web page; and

when the tracer value is present in the received web page, submitting a second key-value pair to the web
site, wherein the value of the second key-value pair comprises a script.

20. The computer-readable medium of claim 19, wherein the script, if executed, signifies that the web site is vulnerable
to a cross site scripting attack, and wherein a format of the script-comprised value of the second key-value pair is
based on the determined location of the tracer value.

21. The computer-readable medium of claim 19, wherein the computer readable instructions further comprise writing
vulnerability data corresponding to the web page to a log file, based on whether the script is executed.

22. The computer-readable medium of claim 20, wherein the location of the tracer value is determined based on a 
document object model of the web page. 

23. The computer-readable medium of claim 22, wherein when the location of the tracer value is within displayed text
of a body of the web page, the format of the script-comprised value begins with a script tag.

24. The computer-readable medium of claim 22, wherein when the location of the tracer value is within an HTML tag, 
the format of the script-comprised value begins with indicia that closes the HTML tag. 

25. The computer-readable medium of claim 24, wherein when the location of the tracer value is within an IMG tag, 
the format of the script-comprised value begins with a graphical file extension. 

26. The computer-readable medium of claim 22, wherein when the location of the tracer value is within a script block,
the format of the script-comprised value does not begin with a <SCRIPT> tag.

27. The computer-readable medium of claim 24, wherein when the location of the tracer value is within a comment
field, the format of the script-comprised value begins with "-->".

28. The computer-readable medium of claim 19, wherein the computer readable instructions further comprise:

prior to performing the sub-method, determining whether the web site falls within a range of allowed web sites;
and

if the web site does not fall within the range of allowed web sites, halting execution of the computer-performed
method.

29. The computer-readable medium of claim 19, wherein the computer readable instructions further comprise:

prior to performing the sub-method, determining whether the computer performing the method is located on
a home network; and

if the computer performing the method is not located on the home network, halting execution of the
computer-performed method.

30. The computer-readable medium of claim 19, wherein web site information from which the key-value pairs are
determined is received via a clipboard of the computer's operating system.

31. The computer-readable medium of claim 19, wherein web site information from which the key-value pairs are 
determined is received via an input file comprising a listing of multiple web sites to be tested. 

32. The computer-readable medium of claim 19, further comprising:

when the web site has no corresponding key-value pairs, submitting to the web site a third key-value pair,
wherein the value of the third key-value pair comprises a script that, if executed, signifies that the web site is
vulnerable to a cross site scripting attack.

33. The computer-readable medium of claim 19, wherein the key-value pairs are submitted via a form. 

34. The computer-readable medium of claim 33, wherein the key-value pairs are submitted via a POST command. 

35. The computer-readable medium of claim 19, wherein the key-value pairs are submitted via a URL. 

36. The computer-readable medium of claim 35, wherein the key-value pairs are submitted via a GET method. 

37. A computer system comprising:

a processor; and

memory storing computer readable instructions that, when executed by the processor, cause the computer
system to perform a method for automated detection of a cross site scripting vulnerability of a web site,
comprising:

determining key-value pairs corresponding to the web site;

for each determined key-value pair, at least until a first vulnerability is detected, performing a sub-method
comprising:

submitting the key-value pair to the web site, wherein the value of the key-value pair comprises a
tracer value;

receiving a web page responsive to the submitted key-value pair;

determining a location of the tracer value, when present, in the received web page; and

when the tracer value is present in the received web page, submitting a second key-value pair to the
web site, wherein the value of the second key-value pair comprises a script.

38. The computer system of claim 37, wherein the script, if executed, signifies that the web site is vulnerable to a cross
site scripting attack, and wherein a format of the script-comprised value of the second key-value pair is based on
the determined location of the tracer value.

39. The computer system of claim 37, wherein the computer readable instructions further comprise writing vulnerability 
data corresponding to the web page to a log file, based on whether the script is executed. 

40. The computer system of claim 38, wherein the location of the tracer value is determined based on a document
object model of the web page.

41. The computer system of claim 40, wherein when the location of the tracer value is within displayed text of a body
of the web page, the format of the script-comprised value begins with a script tag.

42. The computer system of claim 40, wherein when the location of the tracer value is within an HTML tag, the format 
of the script-comprised value begins with indicia that closes the HTML tag. 

43. The computer system of claim 42, wherein when the location of the tracer value is within an IMG tag, the format
of the script-comprised value begins with a graphical file extension.

44. The computer system of claim 40, wherein when the location of the tracer value is within a script block, the format 
of the script-comprised value does not begin with a <SCRIPT> tag. 

45. The computer system of claim 42, wherein when the location of the tracer value is within a comment field, the
format of the script-comprised value begins with "-->".

46. The computer system of claim 37, wherein the computer readable instructions further comprise: 

prior to performing the sub-method, determining whether the web site falls within a range of allowed web sites;
and

if the web site does not fall within the range of allowed web sites, halting execution of the computer-performed
method.

47. The computer system of claim 37, wherein the computer readable instructions further comprise: 

prior to performing the sub-method, determining whether the computer performing the method is located on 
a home network; and 

if the computer performing the method is not located on the home network, halting execution of the computer- 
performed method. 

48. The computer system of claim 37, wherein web site information from which the key-value pairs are determined is 
received via a clipboard of the computer's operating system. 

49. The computer system of claim 37, wherein web site information from which the key-value pairs are determined is 
received via an input file comprising a listing of multiple web sites to be tested. 

50. The computer system of claim 37, further comprising: 

when the web site has no corresponding key-value pairs, submitting to the web site a third key-value pair,
wherein the value of the third key-value pair comprises a script that, if executed, signifies that the web site is
vulnerable to a cross site scripting attack.

51. The computer system of claim 37, wherein the key-value pairs are submitted via a form.

52. The computer system of claim 51, wherein the key-value pairs are submitted via a POST command.

53. The computer system of claim 37, wherein the key-value pairs are submitted via a URL. 

54. The computer system of claim 53, wherein the key-value pairs are submitted via a GET method. 

http://www.goodguy.com/hello.asp?name=Tom

FIG. 1 



<HTML> 
<HEAD> 

<TITLE>Hello Example</TITLE> 
</HEAD> 
<BODY> 
Hello, Tom. 
</BODY> 
</HTML> 

FIG. 2 



http://www.goodguy.com/hello.asp?name=<SCRIPT>some-evil-JavaScript-goes-here;</SCRIPT>

FIG. 3 



<HTML> 
<HEAD> 

<TITLE>Hello Example</TITLE>
</HEAD> 
<BODY> 

Hello, <SCRIPT>some-evil-JavaScript-goes-here;</SCRIPT>.

</BODY> 
</HTML> 

FIG. 4 